4 research outputs found

    GPU NTC Process Variation Compensation with Voltage Stacking

    Get PDF
    Near-threshold computing (NTC) has the potential to significantly improve efficiency in high throughput architectures, such as general-purpose computing on graphic processing unit (GPGPU). Nevertheless, NTC is more sensitive to process variation (PV) as it complicates power delivery. We propose GPU stacking, a novel method based on voltage stacking, to manage the effects of PV and improve the power delivery simultaneously. To evaluate our methodology, we first explore the design space of GPGPUs in the NTC to find a suitable baseline configuration and then apply GPU stacking to mitigate the effects of PV. When comparing with an equivalent NTC GPGPU without PV management, we achieve 37% more performance on average. When considering high production volume, our approach shifts all the chips closer to the nominal non-PV case, delivering on average (across chips) Ëś80 % of the performance of nominal NTC GPGPU, whereas when not using our technique, chips would have Ëś50 % of the nominal performance. We also show that our approach can be applied on top of multifrequency domain designs, improving the overall performance

    Project of a quantum coprocessor for crytographic algorithms optimization.

    No full text
    A descoberta do algoritmo de Shor, para a fatoração de inteiros em tempo polinomial, motivou esforços rumo a implementação de um computador quântico. Ele é capaz de quebrar os principais criptossistemas de chave pública usados hoje (RSA e baseados em curvas elípticas). Estes fornecem diversos serviços de segurança, tais como confidencialidade e integridade dos dados e autenticação da fonte, além de possibilitar a distribuição de uma chave simétrica de sessão. Para quebrar estes criptossistemas, um computador quântico grande (2000 qubits) é necessário. Todavia, alternativas começaram a ser investigadas. As primeiras respostas vieram da própria mecânica quântica. Apesar das propriedades interessantes encontradas na criptografia quântica, um criptossistema completo parece inatingível, principalmente devido as assinaturas digitais, essenciais para a autenticação. Foram então propostos criptossitemas baseadas em problemas puramente clássicos que (acredita-se) não são tratáveis por computadores quânticos, que são chamadas de pós-quânticas. Estes sistemas ainda sofrem da falta de praticidade, seja devido ao tamanho das chaves ou ao tempo de processamento. Dentre os criptossistemas pós-quânticos, destacam-se o McEliece e o Niederreiter. Por si só, nenhum deles prevê assinaturas digitais, no entanto, as assinaturas CFS foram propostas, complementandos. Ainda que computadores quânticos de propósito geral estejam longe de nossa realidade, é possível imaginar um circuito quântico pequeno e dedicado. A melhoria trazida por ele seria a diferença necessária para tornar essas assinaturas práticas em um cenário legitimamente pós-quântico. Neste trabalho, uma arquitetura híbrida quântica/clássica é proposta para acelerar algoritmos criptográficos pós-quânticos. Dois coprocessadores quânticos, implementando a busca de Grover, são propostos: um para auxiliar o processo de decodificação de códigos de Goppa, no contexto do criptossistema McEliece; outro para auxiliar na busca por síndromes decodificáveis, no contexto das assinaturas CFS. Os resultados mostram que em alguns casos, o uso de um coprocessador quântico permite ganhos de até 99; 7% no tamanho da chave e até 76; 2% em tempo de processamento. Por se tratar de um circuito específico, realizando uma função bem específica, é possível manter um tamanho compacto (300 qubits, dependendo do que é acelerado), mostrando adicionalmente que, caso computadores quânticos venham a existir, eles viabilizarão os criptossistemas pós-quânticos antes de quebrar os criptossistemas pré-quânticos. Adicionalmente, algumas tecnologias de implementação de computadores quânticos são estudadas, com especial enfoque na óptica linear e nas tecnologias baseadas em silício. Este estudo busca avaliar a viabilidade destas tecnologias como potenciais candidatas à construção de um computador quântico completo e de caráter pessoal.The discovery of the Shor algorithm, which allows polynomial time factoring of integers, motivated efforts towards the implementation of a quantum computer. It is capable of breaking the main current public key cryptosystems used today (RSA and those based on elliptic curves). Those provide a set of security services, such as data confidentiality and integrity and source authentication, and also the distribution of a symmetric session key. To break those cryptosystem, a large quantum computer (2000 qubits) is needed. Nevertheless, cryptographers have started to look for alternatives. Some of which came from quantum mechanics itself. Despite some interesting properties found on quantum cryptography, a complete cryptosystem seems intangible, specially because of digital signatures, necessary to achieve authentication. Cryptosystems based on purely classical problems which are (believed) not treatable by quantum computers, called post-quantum, have them been proposed. Those systems still lacks of practicality, either because of the key size or the processing time. Among those post-quantum cryptosystems, specially the code based ones, the highlights are the McEliece and the Niederreiter cryptosystems. Per se, none of these provides digital signatures, but, the CFS signatures have been proposed, as a complement to them. Even if general purpose quantum computers are still far from our reality, it is possible to imagine a small dedicated quantum circuit. The benefits brought by it could make the deference to allow those signatures, in a truly post-quantum scenario. In this work, a quantum/classical hybrid architecture is proposed to accelerate post-quantum cryptographic algorithms. Two quantum coprocessors, implementing the Grover search, are proposed: one to assist the decoding process of Goppa codes, in the context of the McEliece and Niederreiter cryptosystems; another to assist the search for decodable syndromes, in the context of the CFS digital signatures. The results show that, for some cases, the use of the quantum coprocessor allows up to 99; 7% reduction in the key size and up to 76; 2% acceleration in the processing time. As a specific circuit, dealing with a well defined function, it is possible to keep a small size (300 qubits), depending on what is accelerated), showing that, if quantum computers come to existence, they will make post-quantum cryptosystems practical before breaking the current cryptosystems. Additionally, some implementation technologies of quantum computers are studied, in particular linear optics and silicon based technologies. This study aims to evaluate the feasibility of those technologies as potential candidates to the construction of a complete and personal quantum computer

    Improving the Productivity of Hardware Design

    No full text
    Current hardware development techniques contrast with agile methods that became popular in modern software development. This has been mitigated with technology scaling, when performance gains for every generation relied mostly on transistor shrinking. However, the end of Dennard’s scaling, the limitations in multicore design and with hardware accelerators emerging as an alternative to improve performance, hardware design has become an important bottleneck for chip developers. This is particularly important as application domain experts, who are not hardware designers, turn to hardware accelerators to make new technologies viable. In this dissertation, I discuss efforts to improve hardware design productivity: improving pipeline design and reducing synthesis runtime. Pipeline configuration is typically set very early in the design phase, which make changes costly. I proposed Fluid Pipelines, a novel design style that allows for changes in the number of pipeline stages late in the design cycle. To accurately evaluate the impact of pipeline changes, a designer needs to wait for synthesis results. I also proposed LiveSynth and SMatch, two incremental techniques that re-use existing synthesis results to drastically reduce synthesis time. Combined with work from others, I expect these techniques to ease design overhead and improve the adoption of domain specific hardware
    corecore